A novel nonlinear grey Bernoulli model NGBM(1,1,t^p,α) and its application in forecasting the express delivery volume per capita in China

The grey prediction is a common method in the prediction. Studies show that general grey models have high modeling precision when the time sequence varies slowly, but some grey models show low modeling precision for the high-growth sequence. The paper researches the grey modeling for the high-growth sequence using the extended nonlinear grey Bernoulli model NGBM(1,1,t⌃p,α). To improve the nonlinear grey Bernoulli model NGBM(1,1,t⌃p,α)’s prediction precision and make data have better adaptability to the model, the paper makes improvements in the following three aspects: (1) the paper improves the accumulated generating sequence of original time sequence, i.e. making a new transformation of traditional accumulated generating sequence; (2) the paper improves the model’s structure, extends the grey action and builds an extended nonlinear grey Bernoulli model NGBM(1,1,t⌃p,α); (3) the paper improves the model’s background value and uses the value of cubic spline function to approximate the background value. Because the parameters of the new accumulated generating sequence transformed, the nonlinear grey Bernoulli model’s time response equation and the background value are optimized simultaneously, the prediction precision increases greatly. The paper builds an extended nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) using the method proposed and seven comparison models for China’s express delivery volume per capita. Comparison results show that the extended nonlinear grey Bernoulli model built with the method proposed has high simulation and prediction precision and shows the precision superior to that of seven comparison models.


Introduction
The grey system theory proposed by Chinese scholar Jvlong Deng in 1982 is mainly applied to the studies on problems with less data and poor information. The grey prediction is an important component of grey system theory. Currently, it has been used widely in economic fields like industry, agriculture and commerce, and other fields, such as environment, energy, society and military, but it has big prediction errors sometimes. In this case, to improve the grey model's prediction precision, many scholars have made various studies on the grey prediction model. Cai et al. [1] proposed an asphalt pavement performance prediction method based on fractional grey model, which adopted fractional cumulative generation operation instead of traditional cumulative generation operation. Hu [2] proposed a GM(1,1) model based on residual correction for energy prediction. Different from the usual practice, the residual was not created independently. Experimental results showed that the proposed residual GM(1,1) model performed well compared with other residual GM(1,1) models. Cheng et al. [3] established a GM(1,1) model of China's total industrial crude oil consumption by improving the parameter estimation of the model through four methods without changing the model structure. Liu et al. [4] proposed a constant stress accelerated degradation test and combined prediction model based on the Wiener process GM(1,1) model in order to predict the residual current device's residual life. To improve the prediction accuracy of landslide deformation, Huang et al. [5] proposed an improved deformation prediction algorithm based on GM(1,1) model and empirical mode decomposition. To accurately predict medium-and long-term renewable energy generation, He [6] established a structurally adaptive discrete grey prediction model with new information priority. Wu et al. [7] studied CO2 emissions of BRICS countries (Brazil, Russia, India, China, and South Africa) by using a fractional non-homogeneous grey model, and the new model was developed to predict CO2 emissions of BRICS countries from 2019 to 2025. Wang et al. [8] proposed a new adaptive grey model of Caputo fractional derivative structure and a new Caputo fractional cumulative generator in order to accurately predict the future energy trend. By comparing optimization algorithms, particle swarm optimization algorithm was selected to optimize the model parameters. To predict photovoltaic energy, Yu et al. [9] developed a new grey system model, which takes into account the time-delay power effect with high flexibility, so as to deal more effectively with small and complex time series and has a more general formula. Three practical cases of energy prediction verified the effectiveness of the new model. The actual case showed that their proposed model was superior to the existing eight grey prediction models. Wang et al. [10] proposed a new adaptive grey Chebyshev polynomial Bernoulli model with fractional order structure to accurately predict renewable energy in China, which was based on NGBM(1,1) model to describe nonlinear phenomena. Xiang et al. [11] established the PM10 short-term prediction model using the grey system theory. They adopted Euler polynomial as the driver of the grey model, then adopted the convolution solution to make the new model computationally feasible, and adopted the grey wolf optimizer to optimize and select the nonlinear parameters of the model. Xiang et al. [12], aiming at the uncertainty of sewage discharge prediction in China, introduced the hyperbolic time delay term on the basis of grey system theory, established a new prediction model, determined the key nonlinear parameters of the model by grasshopper optimization algorithm, and proved the reliability of the model through a series of practical examples. The above scholars improved the grey model from different angles to improve the prediction accuracy. The grey models generally used include the GM(1,1) model [13], the GM(1,N) model [14], the GM(N,1) model [15] and the nonlinear grey Bernoulli model [16,17]. In the models, the nonlinear grey Bernoulli model is an important type, which has the strong adaptability to data fitting and thus has higher prediction precision generally. However, it's hard to solve the power exponent and its parameters in the model, so the research on the nonlinear grey Bernoulli model is difficult and develops slowly.
Some scholars have researched the nonlinear grey Bernoulli model early. Wang et al. [18], starting from the whitening equation of the nonlinear grey Bernoulli model, solved the specific expression of the power index by using the grey derivative information covering principle, and discussed the relevant properties of the nonlinear grey Bernoulli model. However, the formula for solving the power index was relatively complex. Wang [19] improved the grey derivative of nonlinear grey Bernoulli model to improve the modeling accuracy. Wang et al. [20] optimized the power index and the parameters in the time-response formula of the nonlinear grey Bernoulli model through software programming to improve the modeling accuracy, but did not give the specific formula for solving the power index. Zhang and Chen [21] improved the modeling accuracy by processing the background value and initial conditions, but ignored the exponential part of the time response when they abstracted the curve into a non-homogeneous exponential function in the process of optimizing the background value. Hu [22] optimized the background value through data transformation. Compared with the traditional nonlinear grey Bernoulli model, although this method improved the modeling accuracy, it ignored the constant term in the time response.
In recent years, some scholars have made more in-depth theoretical and application studies on the nonlinear grey Bernoulli model. Cheng and Liu [23], Ma and Wei [24] extended the grey action amount of the traditional nonlinear grey Bernoulli model and proposed the extended nonlinear grey Bernoulli model, and gave the modeling method, and the examples showed that their proposed extended nonlinear grey Bernoulli model had high accuracy. Luo and Chen [25] proposed a nonlinear grey Bernoulli model with linear time-varying parameters for the prediction of non-equidistant time series. Zeng [26], Huang et al. [27] proposed a new nonlinear grey Bernoulli model building method to solve the problem that the traditional nonlinear grey Bernoulli model was not good at predicting the oscillation time series. Examples showed that their proposed model and method had high accuracy. Ding et al. [28] simultaneously optimized the power exponent and initial conditions of the nonlinear grey Bernoulli model, obtained the model parameter values by minimizing the average error, and verified the effectiveness and practicability of the model with an example. Zhou et al. [29] proposed a new time-varying grey Bernoulli model for accurate prediction of electric vehicles inventory and sales. They first designed time-varying parameters and power indices to explore the nonlinear development trend of the series, and then determined the optimal solution by using the cuckoo search algorithm. Wu et al. [30] proposed a new nonlinear grey Bernoulli model with a combinatorial component accumulation generator to predict oil consumption and end-consumption in China, considering the uncertainty and volatility of the petroleum system. The newly designed model combined the traditional fraction accumulation with the fitting fraction accumulation and introduced a combinatorial fraction accumulation generator. Compared with the old accumulation, the new optimized accumulation can improve the flexible mining ability of the development law of time series. Zeng et al. [31] proposed a matrix nonlinear grey Bernoulli model based on interval sequence in order to accurately predict power generation. To make the model directly applicable to interval sequence, the traditional grey Bernoulli model was linearized by power function transformation, then the parameters were matrixed, and finally the recursive prediction formula was obtained by Cramer's law. The new model can be used to predict exponential interval series, saturation interval series and parabolic interval series. Yin and Mao [32] proposed a new fractional multivariate grey Bernoulli model for short-term power load prediction. The fractional differential equation and fractional cumulative generation sequence were integrated into the new model. Secondly, they improved the grey wolf algorithm, so that many parameters in the model were better optimized. Tong et al. [33], To accurately predict natural gas consumption, started from the traditional nonlinear grey Bernoulli model, expanded the development coefficient and grey action, proposed a new adaptive time-varying grey Bernoulli prediction model, derived the model's time-response formula, and discussed the relationship between model parameters and model accuracy. Then, the nonlinear parameters and power parameters of the new model were optimized by genetic algorithm, and the validity of the model was analyzed by predicting the consumption of natural gas in China, the United States and Russia.
The scholars mainly improved the model in terms of grey derivative, power exponent, initial value, background value, model extension and parameter optimization to further improve the precision of model, but there were some defects in the modeling for some data, such as the high-growth sequence, manifested as the poor modeling precision. There are, essentially, the following three main factors affecting the modeling precision: first, the data should adapt to the variation law of model; second, the model should meet the variation characteristics of data; third, the method should be chosen properly. To improve the modeling precision for highgrowth sequence, the paper makes improvements in the following three aspects: (1) the paper improves the accumulated generating sequence of original time sequence, i.e. making a new transformation of traditional accumulated generating sequence, to enhance the high-growth sequence's adaptability to the model; (2) the paper improves the model's structure and extends the grey action, i.e. building an extended nonlinear grey Bernoulli model NGBM (1,1,t⌃p,α), to make the model adapt to the variation of high-growth sequence; (3) the paper improves the model's background value and uses the cubic spline function value to approximate the background value to enhance the estimation precision of parameter.
Because the parameters improved in the three aspects are optimized simultaneously, the precision of model increases greatly. To avoid big average simulation relative error (MAPE 1 ) or big average prediction relative error (MAPE 2 ) of the model, we make the objective function be the minimum of max(MAPE 1 , MAPE 2 ).
It should be noticed that the improvement in the accumulated generating sequence of original time sequence has many transformation forms and the extension of the model's grey action also has many manifestation forms, so how to find the proper form should be studied further.
The paper first gives the modeling method of traditional nonlinear grey Bernoulli model, then proposes the form of extended nonlinear grey Bernoulli model NGBM(1,1,t⌃p,α), next gives the method to build an extended nonlinear grey Bernoulli model NGBM(1,1,t⌃p,α), i.e. giving the time response equation for prediction and its parameter optimization method, and finally builds an extended nonlinear grey Bernoulli model NGBM(1,1,t⌃p,α) using the method proposed and other seven models for China's express delivery volume per capita. Results show that the extended nonlinear grey Bernoulli model built with the method proposed has high simulation and prediction precision and shows the precision superior to that of other seven models in comparisons. The traditional nonlinear grey Bernoulli model writes z (2) (k) as (z (1) (k)) α approximately, then have

The method to build the traditional nonlinear grey Bernoulli model
x ð0Þ ðkÞ þ az ð1Þ ðkÞ ¼ bðz ð1Þ ðkÞÞ a : ð4Þ ðkÞÞ. Definition 2 [34]: Call x (0) (k) + az (1) (1) (k)) α the grey differential equation of nonlinear grey Bernoulli model.  : ð6Þ Theorem 2 [34]: The time response function of nonlinear grey Bernoulli model iŝ The estimate of power exponent α can be obtained with the optimization method generally. According to the time response function sequence, we can get the simulation value and prediction value of original sequence fromx ð0Þ ðkÞ ¼x ð1Þ ðkÞ Àx ð1Þ ðk À 1Þ.

The form of extended nonlinear grey Bernoulli model NGBM(1,1,t⌃p,α)
To make data have better adaptability to the model, the paper first improves the accumulated generating sequence of x (0) (t). Then, have the following definition. Definition 4 [35]: Suppose the original time sequence is Especially when g = 1, h = 0, sequence x (r) becomes x (0) , so for a specific grey model, if it meets the rules of traditional accumulated generating sequence, it must meet the rules of new accumulated generating sequence. Therefore, improving the accumulated generating sequence of x (0) (t) can improve modeling precision. In particular, the improvement in modeling accuracy is greater for high-growth sequences.
Next, the paper extends the structure of traditional nonlinear grey Bernoulli model, extending the grey action (x (r) (t)) α to be ) α to make the model have the generality to be adaptive to more variations of data. Then, have the following definitions. In particular, the adaptability of high-growth sequences to the model is enhanced.
To compare the method proposed with the grey models proposed in other referencing documents in terms of modeling precision, the paper makes calculations.
Build a model using the improved method of extended nonlinear grey Bernoulli model given by referencing document [23] and then get the following time response equation: Calculate and get the simulation value and prediction value of original sequence witĥ x ð0Þ ðkÞ ¼x ð1Þ ðkÞ Àx ð1Þ ðk À 1Þ, and the results are shown in Table 2. Table 2 shows the relative errors and average relative error in the periods.

PLOS ONE
where x (1) (t) is the once accumulated generating column of original sequence and y (1) (t) is the once accumulated generating column of GDP per capita. We calculate and get the simulation values and prediction values of original sequence in these periods withx ð0Þ ðtÞ ¼x ð1Þ ðtÞ Àx ð1Þ ðt À 1Þ and the results are shown in Table 3. See Table 3 for the relative errors and average relative errors in the periods.
Next, we build non-grey models. First, we build an ARIMA model. We get the ARIMA (1,2,2) model through calculations: ð1 À BÞr 2xð0Þ ðkÞ ¼ 0:742394 þ ð1 þ 0:373654B À 0:0294134B 2 ÞεðkÞ: ð61Þ We calculate and get the simulation and prediction values of original sequence in these periods with the model, and the results are shown in Table 4. See Table 4 for the relative errors and average relative errors in the periods.
Then, we build a lagged regression model. We calculate and get x ð0Þ ðkÞ ¼ 1:792897x ð0Þ ðk À 1Þ À 0:7099944x ð0Þ ðk À 2Þ þ 0:7478515 þ εðkÞ: We calculate and get the simulation and prediction values of original sequence in these periods with the model, and the results are shown in Table 4. See Table 4 for the relative errors and average relative errors in the periods. Fig 1 is the histogram of relative errors of eight models built for China's express delivery volume per capita. Models 1*8 are the traditional nonlinear grey Bernoulli model, the model proposed, the model of document [23], the model of document [36], the model of document [37], the model of document [38], the ARIMA model, and the time-lagged regression model. We can see that the model proposed has the highest modeling precision and the precision superior to that of other models.

The robustness evaluation
The robustness evaluation refers to a statistical evaluation on the stability of the model built.
We suppose that, for the model, the parameter estimate is η, the residua sum of squares is s, the sub-sample variance is m 2 s , the degree of freedom is n s and the population variance is d 2 s . We give η a small perturbation Δ and record Q = η + Δ. We consider Q as the parameter value of model to get the fitting residual of r, and consider corresponding sub-sample variance as m 2 r , freedom degree as n r and population variance as d 2 r to construct the statistics F * F(n s − 1, n r − 1). We suppose H 0 : � Fa 2 ðn s À 1; n r À 1Þ . Therefore, the model is robust.

The precision evaluation
The precision evaluation is an evaluation on the fitting accuracy of model. It generally analyzes the practical fitting results and calculates some statistics, such as the model's coefficient of For the extended nonlinear grey Bernoulli model established in this paper, we calculate and get the coefficient of determination R 2 = 0.9972 close to 1, the mean absolute relative error MARE = 2.70% less than 3%, and the Theil index of inequality μ = 0.0272 close to 0, so the model has high precision. Table 1 gives the calculation results of the traditional nonlinear grey Bernoulli model built with the conventional method and the extended nonlinear grey Bernoulli model built with the method proposed. The results show that the nonlinear grey Bernoulli model built with conventional method has big average simulation and prediction relative errors (7.94% and 13.31%,

PLOS ONE
respectively); the extended nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) built with the method proposed has small average simulation and prediction relative errors (2.70% and 2.69%, respectively). It can be seen that the extended nonlinear grey Bernoulli model NGBM (1,1,t⌃2,α) built with the method proposed has high precision and is superior to the traditional nonlinear grey Bernoulli model built with the conventional method significantly. Table 2 gives the calculation results of the grey models built with the methods given by referencing documents [23,36]. The results show that x (0) calculated with the method of referencing document [23] has small average simulation and prediction relative errors (4.89% and 4.88%, respectively) comparatively, but its precision is lower than the calculation precision of the extended nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) built with the method proposed; x (0) calculated with the method of referencing document [36] has big average simulation and prediction relative errors (12.59% and 4.48%, respectively), but its precision is lower than that of the nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) built with the method proposed significantly. Table 3 gives the calculation results of the grey models built with the modeling methods of referencing documents [37,38]. The results show that x (0) calculated with the method of document [37] has big average simulation and prediction relative errors (10.88% and 10.89%, respectively), and its precision is significantly lower than that of the extended nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) proposed; x (0) calculated with the method of document [38] has a big average simulation relative error (15.99%), and although with a small average prediction relative error (2.28%), its overall precision is significantly lower than that of the extended nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) built with the method proposed. Table 4 gives the calculation results of the non-grey models. The results show that the ARIMA model built has big average simulation and prediction relative errors (12.84% and 9.94%, respectively), and its precision is significantly lower than the calculation precision of the extended nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) built with the method proposed; the time-lagged regression model built has a big average simulation relative error (13.49%), and although with a small average prediction relative error (2.73%), the model's overall precision is significantly lower than that of the extended nonlinear grey Bernoulli model NGBM(1,1,t⌃2,α) built with the method proposed.
Therefore, Tables 1 * 4 show that the extended nonlinear grey Bernoulli model NGBM (1,1,t⌃2,α) built with the method proposed has the precision significantly higher than that of the nonlinear grey Bernoulli model built with the traditional method and superior to that of the models built with the methods of referencing documents [23,[36][37][38], the ARIMA model and the time-lagged regression model. It indicates that the extended model and method proposed in the paper have high reliability and effectiveness.
The express delivery volume per capita is a high-growth sequence, and studies have shown that general grey models are adapted to the variations of low-growth sequence. The common grey modeling methods rarely transform the accumulated generating sequence. Some scholars have extended the grey action of grey model, but the extended model failed to present the variation law of high-growth sequence. The paper improves the accumulated generating sequence to make the high-growth sequence adapt to the variation law of nonlinear grey Bernoulli model, and extends the grey action of nonlinear grey Bernoulli model simultaneously to make the improved model present the variation characteristics of high-growth sequence. In addition, we use the cubic spline interpolation function as the background value to make the model have better fitting elasticity. We optimize the parameters in the three aspects simultaneously, to make the model have higher precision overall.